Abstract

In some scenarios there are ways of conveying information with many fewer, even exponentially fewer, 
qubits than possible classically [1], [2], [3]. Moreover, some of these methods have a very simple structure — 
they involve only few message exchanges between the communicating parties. It is therefore natural to 
ask whether every classical protocol may be transformed to a "simpler" quantum protocol — one that has 
similar efficiency, but uses fewer message exchanges. 

We show that for any constant k, there is a problem such that its k+1 message classical communication
complexity is exponentially smaller than its k message quantum communication complexity. This, in
particular, proves a round hierarchy theorem for quantum communication complexity, and implies, via
a simple reduction, an $\Omega(N^{1/k})$ lower bound for k message quantum protocols for Set Disjointness for
constant k.

En route, we prove information-theoretic lemmas, and define a related measure of correlation, the
informational distance, that we believe may be of significance in other contexts as well. 

I. Introduction 

A recurring theme in quantum information processing has been the idea of exploiting 
the exponential resources afforded by quantum states to encode information in very
non-obvious ways. One representative result of this kind is due to Ambainis, Schulman,
Ta-Shma, Vazirani, and Wigderson [2]. They show that two players can deal a random set
of $\sqrt{N}$ cards each, from a pack of N cards, by the exchange of $O(\log N)$ quantum bits
between them. Another example is given by Raz [3] who shows that a natural geometric 
promise problem that has an efficient quantum protocol, is hard to solve via classical 
communication. Both are examples of problems for which exponentially fewer quantum 
bits are required to accomplish a communication task, as compared to classical bits. A 
third example is the $O(\sqrt{N} \log N)$ qubit protocol for Set Disjointness due to Buhrman,
Cleve, and Wigderson [1], which represents quadratic savings in the communication cost 
over classical protocols. 

The protocols presented by Ambainis et al. [2] and Raz [3] share the feature that they 
require minimal interaction between the communicating players. For example, in the 
protocol of Ambainis et al. [2] one player prepares a set of qubits in a certain state and 
sends half of the qubits across as the message, after which both players measure their qubits 
to obtain the result. In contrast, the protocol of Buhrman, Cleve and Wigderson [1] for 
checking set disjointness (DISJ) requires $\Omega(\sqrt{N})$ messages. This raises a natural question:
Can we exploit the features of quantum communication and always reduce interaction 
while maintaining the same communication cost? In particular, are there efficient quantum 
protocols for DISJ that require only a few messages? 

Kitaev and Watrous [4] show that every efficient quantum interactive proof can be trans- 
formed into a protocol with only three messages of similar total length. This suggests that 
it might be possible to reduce interaction in other protocols as well. In this paper we show 
that for any constant k, there is a problem such that its k + 1 message classical
communication complexity is exponentially smaller than its k message quantum communication
complexity, thus answering the above question in the negative. This, in particular, proves 
a round hierarchy theorem for quantum communication complexity, and implies, via a 
simple reduction, polynomial lower bounds for constant round quantum protocols for Set 
Disjointness. 

Our Separation Results 

The role of interaction in classical communication is well-studied, especially in the
context of the Pointer Jumping function [5], [6], [7], [8], [9]. Our first result is for a
subproblem $S_k$ of Pointer Jumping that is singled out in Miltersen et al. [10] (see Section V-A for
a formal definition of $S_k$). We show:

Theorem I.1: For any constant k, there is a problem $S_{k+1}$ such that any quantum
protocol with only k messages and constant probability of error requires $\Omega(N^{1/(k+1)})$
communication qubits, whereas it can be solved with k+1 messages by a deterministic protocol
with $O(\log N)$ bits.

A more precise version of this theorem is given in Section V-D and implies a round 
hierarchy even when the number of messages k grows as a function of input size N, up
to $k = \Theta(\log N / \log\log N)$. Our analysis of $S_k$ follows the same intuition as that behind
the result of Miltersen et al. [10], but relies on entirely new ideas from quantum information 
theory. The resulting lower bound is optimal for a constant number of rounds. 

Next, we study the Pointer Jumping function itself. Let $f_k$ denote the Pointer Jumping
function with path length k + 1 on graphs with 2n vertices, as defined in Section VI.
The input length for the Pointer Jumping function $f_k$ is $N = 2n \log n$, independent of k,
whereas the input length for $S_k$ is exponential in k. The function $f_k$ is thus usually more
appropriate for studying the effect of rounds on communication when k grows rapidly as 
a function of the input length. 

We first show an improved upper bound on the classical complexity of Pointer Jumping, 
further closing the gap between the known classical upper and lower bounds. We then 
turn to proving a quantum lower bound. We prove:

Theorem I.2: For any constant k, there is a classical deterministic protocol with k message
exchanges, that computes $f_k$ with $O(\log n)$ bits of communication, while any k - 1
round quantum protocol with constant error for $f_k$ needs $\Omega(n)$ qubits of communication.

The lower bound of Theorem I.2 decays exponentially in k, and leads only to separation
results for $k = O(\log N)$. We believe it is possible to improve this dependence on k, but
leave it as an open problem. Note that in the preliminary version of this paper [11] this 
decay was even doubly exponential, and the improvement here is obtained by using a 
quantum version of the Hellinger distance. 

Our lower bounds for Sk and Pointer Jumping also have implications for Set Disjointness. 
The problem of determining the quantum communication complexity of DISJ has inspired 
much research in the last few years, yet the best known lower bound prior to this work 
was $\Omega(\log n)$ [2], [12]. We mentioned earlier the protocol of Buhrman et al. [1] which
solves DISJ with $O(\sqrt{N} \log N)$ qubits and $\Omega(\sqrt{N})$ messages. Buhrman and de Wolf [12]
observed (based on a lower bound for random access codes [13], [14]) that any one message 
quantum protocol for DISJ has linear communication complexity. We describe a simple 
reduction from Pointer Jumping in a bounded number of rounds to DISJ and prove: 

Corollary I.3: For any constant k, the communication complexity of any k-message
quantum protocol for Set Disjointness is $\Omega(N^{1/k})$.

A model of quantum communication complexity that has also been studied in the lit- 
erature is that of communication with prior entanglement (see, e.g., Refs. [15], [12]). In 
this model, the communicating parties may hold an arbitrary input-independent entangled 
state in the beginning of a protocol. One can use superdense coding [16] to transmit n 
classical bits of information using only $\lceil n/2 \rceil$ qubits when entanglement is allowed. The
players may also use measurements on EPR-pairs to create a shared classical random key.
While the first idea often decreases the communication complexity by a factor of two, the
second sometimes saves $\log n$ bits of communication. It is unknown if shared
entanglement may sometimes decrease the communication more than that. Currently no general
methods for proving super-logarithmic lower bounds on the quantum communication com- 
plexity with prior entanglement and unrestricted interaction are known. Our results all 
hold in this model as well. 

Our interest in the role of interaction in quantum communication also springs from the 
need to better understand the ways in which we can access and manipulate information 
encoded in quantum states. We develop information-theoretic techniques that expose 
some of the limitations of quantum communication. We believe our information-theoretic 
results are of independent interest. 

The paper is organized as follows. In Section II we give some background on classical
and quantum information theory. We recommend Preskill's lecture notes [17] or Nielsen
and Chuang's book [18] as thorough introductions to the field. In Section III we present
new lower bounds on the quantum relative entropy function (Section III-A) and introduce
the informational distance (Section III-B). In Section IV we explain the communication 
complexity model, followed by Section V where we prove our separation results and the 
reduction to Set Disjointness (Section V-C). In Section VI we give our new upper bound 
(Section VI-B) and quantum lower bound (Section VI-C) for the pointer-jumping problem. 

Subsequent Results 

Subsequent to the publication of the preliminary version of this paper [11] several new 
related results have appeared. First, Razborov proves in Ref. [19] that the quantum
communication complexity of the Set Disjointness problem is indeed $\Omega(\sqrt{N})$, no matter
how many rounds are allowed. An upper bound of $O(\sqrt{N})$ is given by Aaronson and
Ambainis [20]. A result by Jain, Radhakrishnan, and Sen in Ref. [21] shows that the
complexity of protocols solving this problem in k rounds is at least $\Omega(n/k^2)$. The same
authors show in Ref. [22] that quantum protocols with k - 1 rounds for the Pointer Jumping
function have complexity $\Omega(n/k^4)$, but this result seems to hold only for the case of
protocols without prior entanglement. The same authors [23] also consider the complexity
of quantum protocols for the version of the Pointer Jumping function, in which not only
one bit of the last vertex has to be computed, but its full name. Several papers ([24], [25],
[21], [22], [26]) have used the information-theoretic techniques developed in the present
paper. 

In this paper, we improve the dependence of communication complexity lower bounds 
on the number of rounds, as compared to our results in Ref. [11]. To achieve this, we use a 
different information-theoretic tool based on the quantum Hellinger distance. The version
of our Average Encoding Theorem based on Hellinger distance was independently found 
by Jain et al. [21]. 

II. Information Theory Background 

The quantum mechanical analogue of a random variable is a probability distribution 
over superpositions, also called a mixed state. For the mixed state $X = \{p_i, |\phi_i\rangle\}$, where
$|\phi_i\rangle$ has probability $p_i$, the density matrix is defined as $\rho_X = \sum_i p_i |\phi_i\rangle\langle\phi_i|$. Density
matrices are Hermitian, positive semi-definite, and have trace 1. That is, a density matrix has
an eigenvector basis, all the eigenvalues are real and between zero and one, and they sum 
up to one. 

A. Trace Norm And Fidelity 

The trace norm of a matrix A is defined as $\|A\|_t = \mathrm{Tr}\sqrt{A^\dagger A}$, which is the sum of the
magnitudes of the singular values of A. Note that if $\rho$ is a density matrix, then it has
trace norm one. If $\phi_1, \phi_2$ are pure states then:

$\| |\phi_1\rangle\langle\phi_1| - |\phi_2\rangle\langle\phi_2| \|_t = 2\sqrt{1 - |\langle\phi_1|\phi_2\rangle|^2}$.
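The pure-state identity can be checked by hand in the smallest case. The following is a minimal sketch, assuming real qubit states $\cos\theta|0\rangle + \sin\theta|1\rangle$ (the helper names are illustrative and not from the paper). For such states the difference of the two projectors is a real symmetric traceless 2x2 matrix, so its trace norm is available in closed form without a linear algebra library.

```python
import math

def pure_state(theta):
    """Real qubit state cos(theta)|0> + sin(theta)|1> (illustrative helper)."""
    return (math.cos(theta), math.sin(theta))

def trace_norm_of_difference(a, b):
    """Trace norm of |a><a| - |b><b| for real 2-dimensional unit vectors.

    The difference is a real symmetric traceless matrix [[x, y], [y, -x]],
    whose eigenvalues are +sqrt(x^2 + y^2) and -sqrt(x^2 + y^2); the trace
    norm is the sum of their absolute values.
    """
    x = a[0] * a[0] - b[0] * b[0]
    y = a[0] * a[1] - b[0] * b[1]
    return 2.0 * math.sqrt(x * x + y * y)

def closed_form(a, b):
    """Right-hand side of the identity: 2 * sqrt(1 - |<a|b>|^2)."""
    overlap = a[0] * b[0] + a[1] * b[1]
    return 2.0 * math.sqrt(max(0.0, 1.0 - overlap * overlap))

if __name__ == "__main__":
    for alpha, beta in [(0.0, 0.3), (0.7, 2.1), (0.0, math.pi / 2)]:
        a, b = pure_state(alpha), pure_state(beta)
        print(alpha, beta, trace_norm_of_difference(a, b), closed_form(a, b))
```

Both quantities reduce to $2|\sin(\alpha - \beta)|$ for this family, which is why the two columns agree.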

We will need the following consequence of the Kraus representation theorem (see for example
Preskill's lecture notes [17]): 

Lemma II.1: For each Hermitian matrix $\rho$ and each trace-preserving completely positive
superoperator T: $\|T(\rho)\|_t \le \|\rho\|_t$.

A useful alternative to the trace metric as a measure of closeness of density matrices is 
fidelity. Let $\rho$ be a mixed state with support in a Hilbert space $\mathcal{H}$. A purification of $\rho$ is
any pure state $|\phi\rangle$ in an extended Hilbert space $\mathcal{H} \otimes \mathcal{K}$ such that $\mathrm{Tr}_{\mathcal{K}} |\phi\rangle\langle\phi| = \rho$. Given
two density matrices $\rho_1, \rho_2$ on the same Hilbert space $\mathcal{H}$, their fidelity is defined as

$F(\rho_1, \rho_2) = \sup |\langle\phi_1|\phi_2\rangle|^2$,

where the supremum is taken over all purifications $|\phi_i\rangle$ of $\rho_i$ in the same Hilbert space.
Jozsa [27] gave a simple proof, for the finite dimensional case, of the following remarkable 
equivalence first established by Uhlmann [28]. 

Fact II.2 (Jozsa) For any two density matrices $\rho_1, \rho_2$ on the same finite dimensional
space,

$F(\rho_1, \rho_2) = \left[\mathrm{Tr}\sqrt{\sqrt{\rho_1}\,\rho_2\,\sqrt{\rho_1}}\right]^2 = \|\sqrt{\rho_1}\sqrt{\rho_2}\|_t^2$.

Using this equivalence, Fuchs and van de Graaf [29] relate fidelity to the trace distance.

Fact II.3 (Fuchs, van de Graaf) For any two mixed states $\rho_1, \rho_2$,

$1 - \sqrt{F(\rho_1, \rho_2)} \le \frac{1}{2}\|\rho_1 - \rho_2\|_t \le \sqrt{1 - F(\rho_1, \rho_2)}$.

While the definition of fidelity uses purifications of the mixed states and relates them
via the inner product, fidelity can also be characterized via measurements (see Nielsen and
Chuang [18]).

Fact II.4: For two probability distributions p, q on finite sample spaces, let
$F(p, q) = \left(\sum_i \sqrt{p_i q_i}\right)^2$ denote their fidelity. Then, for any two mixed states $\rho_1, \rho_2$,

$F(\rho_1, \rho_2) = \min_{\{E_m\}} F(p_m, q_m)$,

where the minimum is over all POVMs $\{E_m\}$, and $p_m = \mathrm{Tr}(\rho_1 E_m)$, $q_m = \mathrm{Tr}(\rho_2 E_m)$ are
the probability distributions created by the measurement on the states.

A useful property of the trace distance $\|\rho_1 - \rho_2\|_t$ as a measure of distinguishability is
that it is a metric, and hence satisfies the triangle inequality. This is not true for fidelity
$F(\rho_1, \rho_2)$ or for $1 - F(\rho_1, \rho_2)$. Fortunately, a variant of fidelity is actually a metric. Denote
by

$h(\rho_1, \rho_2) = \sqrt{1 - \sqrt{F(\rho_1, \rho_2)}}$

the quantum Hellinger distance. Clearly $h(\rho_1, \rho_2)$ inherits most of the desirable properties
of fidelity, like unitary invariance, definability as a maximum over all measurements of the
classical Hellinger distance of the resulting distributions, and so on. To see that $h(\rho_1, \rho_2)$
is actually a metric one can simply use Fact II.4 to reduce this problem to showing that
the classical Hellinger distance is a metric, which is well known.

Analogously to Lemma II.1, due to the monotonicity of fidelity [18], we have:

Lemma II.5: For all density matrices $\rho_1, \rho_2$ and each trace-preserving completely positive
superoperator T: $h(T(\rho_1), T(\rho_2)) \le h(\rho_1, \rho_2)$.

Let us also note the following relation between the Hellinger distance and the trace
norm that follows directly from Fact II.3.

Lemma II.6: For any two mixed states $\rho_1, \rho_2$,

$h^2(\rho_1, \rho_2) \le \frac{1}{2}\|\rho_1 - \rho_2\|_t \le \sqrt{2}\, h(\rho_1, \rho_2)$.

We will sometimes work with $h^2(\cdot,\cdot)$ instead of $h(\cdot,\cdot)$. This is not a metric, but it is
true that for all density matrices $\rho_1, \rho_2, \rho_3$:

$h^2(\rho_1, \rho_2) \le \left(h(\rho_1, \rho_3) + h(\rho_3, \rho_2)\right)^2 \le 2h^2(\rho_1, \rho_3) + 2h^2(\rho_3, \rho_2)$.
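As a sanity check of the sandwich $h^2 \le \frac{1}{2}\|\cdot\|_t \le \sqrt{2}\,h$ and of the relaxed triangle inequality for $h^2$, the sketch below samples random classical distributions, for which the trace norm is the $\ell_1$ norm. This only exercises the classical special case; the function names, trial count, and seed are assumptions made for illustration.

```python
import math
import random

def random_distribution(size, rng):
    """A random distribution with strictly positive entries."""
    weights = [rng.random() + 1e-9 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def hellinger_squared(p, q):
    """h^2(p, q) = 1 - sum_i sqrt(p_i q_i), clamped at 0 against rounding."""
    return max(0.0, 1.0 - sum(math.sqrt(a * b) for a, b in zip(p, q)))

def half_l1(p, q):
    """(1/2) * ||p - q||_1, the classical counterpart of (1/2) * trace norm."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def check_lemma(trials=1000, size=5, seed=0):
    """Return True if both inequalities hold on every sampled triple."""
    rng = random.Random(seed)
    tol = 1e-12
    for _ in range(trials):
        p, q, r = (random_distribution(size, rng) for _ in range(3))
        h2 = hellinger_squared(p, q)
        d = half_l1(p, q)
        # h^2 <= (1/2)||p - q||_1 <= sqrt(2) * h
        if not (h2 <= d + tol and d <= math.sqrt(2.0 * h2) + tol):
            return False
        # h^2(p, q) <= 2 h^2(p, r) + 2 h^2(r, q)
        if h2 > 2.0 * hellinger_squared(p, r) + 2.0 * hellinger_squared(r, q) + tol:
            return False
    return True

if __name__ == "__main__":
    print(check_lemma())
```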

B. Local Transition Between Bipartite States 
Jozsa [27] proved: 

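Theorem II.7 (Jozsa) Suppose $|\phi_1\rangle, |\phi_2\rangle \in \mathcal{H} \otimes \mathcal{K}$ are the purifications of two density
matrices $\rho_1, \rho_2$ in $\mathcal{H}$. Then, there is a local unitary transformation U on $\mathcal{K}$ such that

$F(\rho_1, \rho_2) = |\langle\phi_1|(I \otimes U)|\phi_2\rangle|^2$.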

As noticed by Lo and Chau [30] and Mayers [31], Theorem II.7 immediately implies
that if two states have close reduced density matrices, then there exists a local unitary
transformation transforming one state close to the other. Formally,

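Lemma II.8: (Local Transition Lemma, based on Refs. [30], [31], [27], [29]) Let $\rho_1, \rho_2$
be two mixed states with support in a Hilbert space $\mathcal{H}$. Let $\mathcal{K}$ be any Hilbert space of
dimension at least $\dim(\mathcal{H})$, and $|\phi_i\rangle$ any purifications of $\rho_i$ in $\mathcal{H} \otimes \mathcal{K}$.

Then, there is a local unitary transformation U on $\mathcal{K}$ that maps $|\phi_2\rangle$ to $|\phi_2'\rangle = (I \otimes U)|\phi_2\rangle$
such that

$h(|\phi_1\rangle\langle\phi_1|, |\phi_2'\rangle\langle\phi_2'|) = h(\rho_1, \rho_2)$.

Furthermore,

$\| |\phi_1\rangle\langle\phi_1| - |\phi_2'\rangle\langle\phi_2'| \|_t \le 2\,\|\rho_1 - \rho_2\|_t^{1/2}$.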
Proof: (Of Lemma II.8): By Theorem II.7, there is a (local) unitary transformation U
on $\mathcal{K}$ such that $(I \otimes U)|\phi_2\rangle = |\phi_2'\rangle$, a state which achieves fidelity: $F(\rho_1, \rho_2) = |\langle\phi_1|\phi_2'\rangle|^2$.
Hence the statement about the Hellinger distance holds.
By Lemma II.6,

$\| |\phi_1\rangle\langle\phi_1| - |\phi_2'\rangle\langle\phi_2'| \|_t \le 2\sqrt{2}\, h(|\phi_1\rangle\langle\phi_1|, |\phi_2'\rangle\langle\phi_2'|)$
$= 2\sqrt{2}\, h(\rho_1, \rho_2)$
$\le 2\,\|\rho_1 - \rho_2\|_t^{1/2}$.

■

C. Entropy, Mutual Information, And Relative Entropy. 

$H(\cdot)$ denotes the binary entropy function $H(p) = p \log(\frac{1}{p}) + (1-p) \log(\frac{1}{1-p})$. The Shannon
entropy S(X) of a classical random variable X on a finite sample space is $\sum_x p_x \log(\frac{1}{p_x})$,
where $p_x$ is the probability the random variable X takes value x. The mutual
information I(X : Y) of a pair of random variables X, Y is defined to be I(X : Y) =
H(X) + H(Y) - H(X,Y). For other equivalent definitions, and more background on
the subject see, e.g., the book by Cover and Thomas [32]. 

We use a simple form of Fano's inequality. 

Fact II.9 (Fano's inequality) Let X be a uniformly distributed Boolean random variable,
and let Y be a Boolean random variable such that Prob(X = Y) = p. Then
$I(X : Y) \ge 1 - H(p)$.
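This form of Fano's inequality is easy to test numerically. The sketch below computes I(X : Y) directly from a joint distribution; the channel parametrization (separate error rates `e0` and `e1` for the two values of X) is an assumption introduced for illustration. A symmetric channel meets the bound with equality, while an asymmetric one with the same agreement probability exceeds it.

```python
import math

def binary_entropy(p):
    """H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def mutual_information(joint):
    """I(X:Y) in bits for a joint distribution given as joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    total = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0.0:
                total += pxy * math.log2(pxy / (px[x] * py[y]))
    return total

def joint_for_channel(e0, e1):
    """X is a uniform bit; Y differs from X with probability e0 if X=0, e1 if X=1."""
    return [[0.5 * (1.0 - e0), 0.5 * e0],
            [0.5 * e1, 0.5 * (1.0 - e1)]]

if __name__ == "__main__":
    # Both channels have Prob(X = Y) = 0.9.
    for e0, e1 in [(0.1, 0.1), (0.0, 0.2)]:
        info = mutual_information(joint_for_channel(e0, e1))
        print(e0, e1, info, 1.0 - binary_entropy(0.9))
```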

The Shannon entropy and the mutual information functions have natural generalizations 
to the quantum setting. The von Neumann entropy $S(\rho)$ of a density matrix $\rho$ is defined
as $S(\rho) = -\mathrm{Tr}\,\rho \log \rho = -\sum_i \lambda_i \log \lambda_i$, where $\{\lambda_i\}$ is the multi-set of all the eigenvalues
of $\rho$. Notice that the eigenvalues of a density matrix form a probability distribution. In
fact, we can think of the density matrix as a mixed state that takes the i'th eigenvector
with probability $\lambda_i$. The von Neumann entropy of a density matrix $\rho$ is, thus, the entropy
of the classical distribution $\rho$ defines over its eigenstates.

The mutual information I(X : Y) of two disjoint quantum systems X, Y is defined to
be I(X : Y) = S(X) + S(Y) - S(XY), where XY is the density matrix of the system that
includes the qubits of both systems. Then

$I(X : YZ) = I(X : Y) + I(XY : Z) - I(Y : Z)$, (1)
$I(X : YZ) \ge I(X : Y)$. (2)

Equation (2) is in fact equivalent to the strong sub-additivity property of von Neumann 
entropy. 

We need the following slight generalization of Theorem 2 in Cleve et al. [15].

Lemma II.10: Let Alice own a state $\rho_A$ of a register A. Assume Alice and Bob
communicate and apply local transformations, and at the end register A is measured in the
standard basis. Assume Alice sends Bob at most k qubits, and Bob sends Alice arbitrarily
many qubits. Further assume all these local transformations do not change the state of
register A, if A is in a classical state. Let $\rho_{AB}$ be the final state of A and Bob's private
qubits B. Then $I(A : B) \le 2k$.

Proof: Considering the joint state of register A and Bob's qubits, there cannot be any 
interference between basis states differing on A. Thus we can assume that $\rho_A$ is measured
in the beginning, i.e., that $\rho_A$ is classical. In this case the result directly follows from
Theorem 2 in Ref. [15]. ■ 

Note that in the above lemma Alice and Bob can use Bob's free communication to set 
up an arbitrarily large amount of entanglement independent of $\rho_A$.

The relative von Neumann entropy of two density matrices is defined by
$S(\rho\|\sigma) = \mathrm{Tr}\,\rho \log \rho - \mathrm{Tr}\,\rho \log \sigma$. One useful fact to know about the relative entropy function is
that $I(A : B) = S(\rho_{AB}\|\rho_A \otimes \rho_B)$. For more properties of this function see Refs. [17], [18].

III. Informational Distance And New Lower Bounds On Relative Entropy

A. New Lower Bounds On Relative Entropy 

We now prove that the relative entropy $S(\rho_1\|\rho_2)$ is lower bounded by $\Omega(\|\rho_1 - \rho_2\|_t^2)$
and by $\Omega(h^2(\rho_1, \rho_2))$. We believe these results are of independent interest. A classical
version of the theorem can be found in, e.g., Cover and Thomas' book on Information
Theory [32].

Theorem III.1: For all density matrices $\rho_1, \rho_2$:

$S(\rho_1\|\rho_2) \ge \frac{1}{2 \ln 2}\,\|\rho_1 - \rho_2\|_t^2$.

Although this relationship has appeared in the literature [33], it was rediscovered by 
several authors, including us. Below we give a proof of this theorem for completeness. The 
earlier version of our paper [11] contained a more complicated proof. 

Proof: (Theorem III.1) The proof goes by reduction to the classical case. Consider
the classical distributions $p_1, p_2$ obtained by measuring $\rho_1, \rho_2$ in the basis diagonalizing
their difference $\rho_1 - \rho_2$. It is known [17], [18] that

$\|p_1 - p_2\|_1 = \|\rho_1 - \rho_2\|_t$.

Due to Lindblad-Uhlmann monotonicity of relative von Neumann entropy [17], [18],

$S(\rho_1\|\rho_2) \ge S(p_1\|p_2)$.

The classical version of the theorem [32] now gives

$S(p_1\|p_2) \ge \frac{1}{2 \ln 2}\,\|p_1 - p_2\|_1^2 = \frac{1}{2 \ln 2}\,\|\rho_1 - \rho_2\|_t^2$.

This completes the proof. ■
Now we show an analogous result for the quantum Hellinger distance. 
Theorem III.2: For all density matrices $\rho_1, \rho_2$:

$S(\rho_1\|\rho_2) \ge \frac{2}{\ln 2}\, h^2(\rho_1, \rho_2)$.

This theorem has also been shown independently by Jain et al. [21]. 

Proof: We first show that the theorem holds when pi and p2 are classical distribu- 
tions, and then generalize this to the quantum case. 

In the classical case we first show $S(p_1\|p_2) \ge -2 \log(1 - h^2(p_1, p_2))$. This was shown
by Dacunha-Castelle in Ref. [34].

$\log(1 - h^2(p_1, p_2)) = \log \sqrt{F(p_1, p_2)}$
$= \log \sum_i \sqrt{p_1(i)\, p_2(i)}$
$= \log \sum_i p_1(i) \sqrt{p_2(i)/p_1(i)}$
$\ge \sum_i p_1(i) \log \sqrt{p_2(i)/p_1(i)}$
$= -\frac{1}{2} S(p_1\|p_2)$.

The first equation is by definition of h, the second by definition of the classical fidelity
function, and the inequality is by an application of Jensen's inequality.

Having that, $S(p_1\|p_2) \ge \frac{2}{\ln 2}\, h^2(p_1, p_2)$ using $-\ln(1 - x) \ge x$ for all $0 \le x < 1$, and so
the theorem holds in the classical case.

To show the quantum case recall that both $h(\cdot,\cdot)$ and $S(\cdot\|\cdot)$ can be defined as the
maximum over all POVM measurements of the classical versions of these functions on the
distributions obtained by the measurements. Fix a POVM $\{E_m\}$ that maximizes $h(p, q)$ for
the distributions p, q obtained from $\rho_1, \rho_2$. Then $S(\rho_1\|\rho_2) \ge S(p\|q)$ by Lindblad-Uhlmann
monotonicity, and $S(p\|q) \ge \frac{2}{\ln 2}\, h^2(p, q) = \frac{2}{\ln 2}\, h^2(\rho_1, \rho_2)$ because $h(p, q) = h(\rho_1, \rho_2)$. The
result follows. ■
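Both lower bounds on relative entropy can be probed numerically in the classical case, where they read $S(p\|q) \ge \frac{1}{2\ln 2}\|p - q\|_1^2$ and $S(p\|q) \ge \frac{2}{\ln 2} h^2(p, q)$ with entropy measured in bits. The following sketch samples random distributions; it is only a spot check of the classical special case, and the helper names, trial count, and seed are illustrative assumptions.

```python
import math
import random

def random_distribution(size, rng):
    """A random distribution with strictly positive entries."""
    weights = [rng.random() + 1e-9 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def relative_entropy(p, q):
    """S(p||q) in bits; q is assumed positive wherever p is."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0.0)

def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def hellinger_squared(p, q):
    """h^2(p, q) = 1 - sum_i sqrt(p_i q_i), clamped at 0 against rounding."""
    return max(0.0, 1.0 - sum(math.sqrt(a * b) for a, b in zip(p, q)))

def check_bounds(trials=1000, size=4, seed=1):
    """Return True if both lower bounds hold on every sampled pair."""
    rng = random.Random(seed)
    tol = 1e-12
    for _ in range(trials):
        p = random_distribution(size, rng)
        q = random_distribution(size, rng)
        s = relative_entropy(p, q)
        if s + tol < l1_distance(p, q) ** 2 / (2.0 * math.log(2.0)):
            return False
        if s + tol < (2.0 / math.log(2.0)) * hellinger_squared(p, q):
            return False
    return True

if __name__ == "__main__":
    print(check_bounds())
```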

B. Informational Distance 

From Theorem III.2 it follows that for a bipartite state $\rho_{AB}$,

$I(A : B) = S(\rho_{AB}\|\rho_A \otimes \rho_B) \ge \frac{2}{\ln 2}\, h^2(\rho_{AB}, \rho_A \otimes \rho_B)$.

Thus the distance between the tensor product state and the "real" (possibly entangled)
bipartite state can be bounded in terms of the Hellinger distance. We call the quantity
$D(A : B) = h(\rho_{AB}, \rho_A \otimes \rho_B)$ the "informational distance." $D(A : B)$ measures the
amount of correlation between the quantum registers A and B, and can be positive even
when the system is classical or not entangled. Later we state some of its properties and use 
it for proving the quantum communication lower bound on the pointer jumping problem. 
The next lemma collects a few immediate properties of informational distance.

Lemma III.3: For all states $\rho_{XYZ}$ the following hold:

1. $D(X : Y) = D(Y : X)$,

2. $0 \le D(X : Y) \le 1$,

3. $D(X : Y) \ge h(T(\rho_{XY}), T(\rho_X \otimes \rho_Y))$ for all completely positive, trace-preserving
superoperators T,

4. $D(XY : Z) \ge D(X : Z)$,

5. $D(X : Y) \le \sqrt{I(X : Y)}$.

Proof: (1) is true by definition, (2) follows from the definition and the triangle
inequality, (3,4) follow from Lemma II.5 and (5) from Theorem III.2. ■

We now examine the informational distance in the special case where $\rho_{QX}$ is block
diagonal, with classical $\rho_X$. We denote by $\rho_Q^x$ the density matrix obtained by fixing X to
some classical value x and normalizing. Pr(x) is the probability of X = x.

Lemma III.4: For all block diagonal $\rho_{QX}$, where $\rho_X$ corresponds to a classical distribution,

1. $D^2(Q : X) = E_x\, h^2(\rho_Q^x, \rho_Q)$.

2. Further assume X is Boolean with Pr(X = 1) = Pr(X = 0) = 1/2. Let there be a
measurement acting on the Q system only, yielding a Boolean random variable Y with
$\Pr(X = Y) \ge 1 - \epsilon$ and $\Pr(X \ne Y) \le \epsilon$. Then $D^2(Q : X) \ge 1/8 - \epsilon/2$.

The first item is true because $\rho_{QX}$ is block-diagonal with respect to X. In the second item,
notice that the same measurement applied to $\rho_X \otimes \rho_Q$ yields a distribution with
$\Pr(X = Y) = \Pr(X \ne Y) = 1/2$, because Q is independent of X, and X is uniform. Observe
that $\|\rho_{XQ} - \rho_X \otimes \rho_Q\|_t \ge \|\rho_{XY} - \rho_X \otimes \rho_Y\|_t \ge 1 - 2\epsilon$ and then apply Lemma II.6. Note
that this is a rather crude estimate, since $D^2(Q : X)$ approaches $1 - 1/\sqrt{2}$ when $\epsilon$ goes to
zero.
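The gap between the $1/8 - \epsilon/2$ bound and the true value can be seen in a simple classical instance. The sketch below assumes Q is itself a noisy classical copy of the uniform bit X (each value flipped with probability $\epsilon$), so that the measurement is trivial; this specific instance and the function names are assumptions for illustration. For a classical joint distribution, $D^2 = 1 - \sum_{x,y} \sqrt{p(x,y)\,p(x)\,p(y)}$.

```python
import math

def informational_distance_squared(joint):
    """D^2(X:Y) = 1 - sum_{x,y} sqrt(p(x,y) p(x) p(y)) for classical joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    overlap = sum(
        math.sqrt(pxy * px[x] * py[y])
        for x, row in enumerate(joint)
        for y, pxy in enumerate(row)
    )
    return max(0.0, 1.0 - overlap)

def noisy_copy(eps):
    """X is a uniform bit; Y equals X except with probability eps."""
    return [[0.5 * (1.0 - eps), 0.5 * eps],
            [0.5 * eps, 0.5 * (1.0 - eps)]]

if __name__ == "__main__":
    for eps in (0.0, 0.05, 0.1, 0.2, 0.25):
        d2 = informational_distance_squared(noisy_copy(eps))
        print(eps, d2, 1.0 / 8.0 - eps / 2.0)
```

In this instance $D^2 = 1 - (\sqrt{1-\epsilon} + \sqrt{\epsilon})/\sqrt{2}$, which equals $1 - 1/\sqrt{2}$ at $\epsilon = 0$ and stays above $1/8 - \epsilon/2$ throughout.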

C. The Average Encoding Theorem 

A corollary of Theorems III.1 and III.2 is the following "Average encoding theorem":

Theorem III.5 (Average encoding theorem) Let $x \mapsto \rho_x$ be a quantum encoding mapping
an m bit string $x \in \{0,1\}^m$ into a mixed state with density matrix $\rho_x$. Let X be
distributed over $\{0,1\}^m$, where $x \in \{0,1\}^m$ has probability $p_x$, let Q be the encoding of X
according to this map, and let $\rho = \sum_x p_x \rho_x$. Then,

$\sum_x p_x \|\rho - \rho_x\|_t \le [(2 \ln 2)\, I(Q : X)]^{1/2}$

and

$\sum_x p_x\, h^2(\rho, \rho_x) \le \frac{\ln 2}{2}\, I(Q : X)$.

In other words, if an encoding Q is only weakly correlated to a random variable X, then 
the "average encoding" p is in expectation (over a random string) a good approximation 
of any encoded state. Thus, in certain situations, we may dispense with the encoding 
altogether, and use the single state p instead. The preliminary version of our paper [11] 
did not include the second statement. The present stronger version was also observed 
independently by Jain et al. [21]. 

Proof: (Of Theorem III.5) In the setting of the Average encoding theorem we have
a random variable X that is distributed over $\{0,1\}^m$, and a quantum encoding $x \mapsto \rho_x$
mapping m bit strings $x \in \{0,1\}^m$ into mixed states with density matrices $\rho_x$. Let X be
the register holding the input x and Q be the register holding the encoding. Let us also
define the average encoding $\rho = \sum_x p_x \rho_x$.

Then, by Theorem III.1,

$I(Q : X) = S(\rho_{QX}\|\rho_Q \otimes \rho_X) \ge \frac{1}{2 \ln 2}\,\|\rho_{QX} - \rho_Q \otimes \rho_X\|_t^2$.

The density matrix $\rho_X$ of the X register alone is diagonal and contains the values
$p_x$ on the diagonal, the density matrix $\rho_Q$ of the Q register alone is $\rho$, and the density
matrix $\rho_Q \otimes \rho_X$ is block diagonal and the x'th block is of the form $p_x \rho$. Also, the density
matrix $\rho_{QX}$ of the whole system is block diagonal, with $p_x \rho_x$ in the x'th block. Thus,
$\|\rho_{QX} - \rho_Q \otimes \rho_X\|_t = \sum_x p_x \|\rho_x - \rho\|_t$, and so $\sum_x p_x \|\rho_x - \rho\|_t \le \sqrt{2 \ln 2}\,\sqrt{I(Q : X)}$.

The second statement follows analogously using Theorem III.2. ■ 
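A classical toy instance illustrates what the Average encoding theorem asserts. In the sketch below each input x is "encoded" as a probability distribution (a diagonal density matrix), so $I(Q : X) = \sum_x p_x S(\rho_x\|\rho)$ in bits. The code checks the trace-norm bound and the Hellinger bound $\sum_x p_x h^2(\rho, \rho_x) \le \frac{\ln 2}{2} I(Q : X)$ that Theorem III.2 gives in this setting. The prior and the encodings are made-up example data, not taken from the paper.

```python
import math

def average_encoding_quantities(prior, encodings):
    """Return (I(Q:X) in bits, average l1 distance, average h^2) to the average encoding."""
    dim = len(encodings[0])
    rho = [sum(px * enc[i] for px, enc in zip(prior, encodings)) for i in range(dim)]
    info = sum(
        px * sum(a * math.log2(a / b) for a, b in zip(enc, rho) if a > 0.0)
        for px, enc in zip(prior, encodings)
    )
    avg_l1 = sum(
        px * sum(abs(a - b) for a, b in zip(enc, rho))
        for px, enc in zip(prior, encodings)
    )
    avg_h2 = sum(
        px * max(0.0, 1.0 - sum(math.sqrt(a * b) for a, b in zip(enc, rho)))
        for px, enc in zip(prior, encodings)
    )
    return info, avg_l1, avg_h2

# Hypothetical example: m = 2 (four inputs), encodings over three outcomes.
PRIOR = [0.25, 0.25, 0.25, 0.25]
ENCODINGS = [
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.30, 0.30, 0.40],
    [0.25, 0.25, 0.50],
]

if __name__ == "__main__":
    info, avg_l1, avg_h2 = average_encoding_quantities(PRIOR, ENCODINGS)
    print("I(Q:X)          =", info)
    print("avg l1 distance =", avg_l1, "<=", math.sqrt(2.0 * math.log(2.0) * info))
    print("avg h^2         =", avg_h2, "<=", 0.5 * math.log(2.0) * info)
```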

IV. The Communication Complexity Model 

In the quantum communication complexity model [35] , two parties Alice and Bob hold 
qubits. When the game starts Alice holds a classical input x and Bob holds y, and so
the initial joint state is simply $|x\rangle \otimes |y\rangle$. Furthermore each player has an arbitrarily large
supply of private qubits in some fixed basis state. The two parties then play in turns. 
Suppose it is Alice's turn to play. Alice can do an arbitrary unitary transformation on 
her qubits and then send one or more qubits to Bob. Sending qubits does not change 
the overall superposition, but rather changes the ownership of the qubits, allowing Bob 
to apply his next unitary transformation on the newly received qubits. Alice may also 
(partially) measure her qubits during her turn. At the end of the protocol, one player 
makes a measurement and declares the result of the protocol. In a classical probabilistic 
protocol the players may only exchange classical messages. 

In both the classical and quantum settings we can also define a public coin model. 
In the classical public coin model the players are also allowed to access a shared source 
of random bits without any communication cost. The classical public and private coin 
models are strongly related [36]. Similarly, in the quantum public coin model Alice and 
Bob initially share an arbitrary number of quantum bits which are in some pure state 
that is independent of the inputs. This is better known as communication with prior 
entanglement [15], [12]. 

The complexity of a quantum (or classical) protocol is the number of qubits (respectively, 
bits) exchanged between the two players. We say a protocol computes a function
$f : \mathcal{X} \times \mathcal{Y} \to \{0,1\}$ with $\epsilon \ge 0$ error if, for any input $x \in \mathcal{X}, y \in \mathcal{Y}$, the probability that the
two players compute f(x,y) is at least $1 - \epsilon$. $Q_\epsilon(f)$ (resp. $R_\epsilon(f)$) denotes the complexity
of the best quantum (resp. probabilistic) protocol that computes f with at most $\epsilon$ error.
For a player P in {Alice, Bob}, $Q^{c,P}_\epsilon(f)$ denotes the complexity of the best quantum
protocol that computes f with at most $\epsilon$ error with only c messages (called rounds in the
literature), where the first message is sent by P. If the name of the player is omitted
from the superscript, either player is allowed to start the protocol. We say a protocol $\mathcal{P}$
computes f with $\epsilon$ error with respect to a distribution $\mu$ on $\mathcal{X} \times \mathcal{Y}$, if

$\mathrm{Prob}_{(x,y) \in \mu,\, \mathcal{P}}(\mathcal{P}(x, y) = f(x, y)) \ge 1 - \epsilon$.

$Q^{c,P}_{\mu,\epsilon}(f)$ is the complexity of computing f with at most $\epsilon$ error with respect to $\mu$, with
only c messages where the first message is sent by player P. We will use the notation Q
(rather than Q*, as in the literature) for communication complexity in the public coin
model. In all the above definitions, we may replace $\mu$ with U when $\mu$ is the uniform
distribution over the inputs.
The following is immediate. 

Fact IV.1: For any distribution $\mu$, number of messages c and player P, $Q^{c,P}_{\mu,\epsilon}(f) \le Q^{c,P}_\epsilon(f)$.

We put two constraints on protocols in the above definitions: 

• We assume that the two players do not modify the qubits holding the classical input
during the protocol. This does not affect the aspect of communication we focus on in this 
paper. 

• We demand that the length of the i'th message sent in a protocol is known in advance. 
This restriction is also implicit in Yao's definition of quantum communication complexity 
using interacting quantum circuits [35]. 

To illustrate this, think of a public coin classical protocol in which Alice first looks at 
a public coin and if the coin is "heads" sends in the first round a message of c qubits and
in the second round a message of 1 qubit, otherwise she sends one qubit in the first round 
and c qubits in the second. In such a protocol the number of message bits sent in the first 
round is not known in advance, and so such a protocol is not allowed in our model. 

A k round protocol with communication complexity c in the more general model, in 
which the restriction above is absent, can be simulated in our model losing a factor of k in 
the communication complexity. To show this one invokes the principle of safe storage. The 
principle says that instead of a mixed state depending on measurement results, we may 
have a superposition over the measurement results and the messages. Note that in such a 
superposition there may be messages of different lengths (augmented by some blanks). In 
the worst case, the length of a single message is now c, so the overall communication cost 
is at most kc, and the number of rounds used is always the worst case number of rounds. 
In the example above we get a 2c communication complexity. 

V. The Role Of Interaction In Quantum Communication 

In this section, we prove that allowing more interaction between two players in a quan- 
tum communication game can substantially reduce the amount of communication required. 
In Section V-A we define a communication problem and formally state our results (giving 
an overview of the proof), then in Section V-B we give the details of the proofs. For
the most part, we will concentrate on communication in a constant number of rounds. 
Section V-C describes the application to the disjointness problem. Section V-D discusses
our results in the case where the number of messages grows as a function of the input size. 

A. The Communication Problem And Its Complexity 

We define a sequence of problems $S_1, S_2, \ldots, S_k, \ldots$ by induction. The problem $S_1$ is
the index function, i.e., Alice has an n-bit string $x \in \mathcal{X}_1 = \{0,1\}^n$, Bob has an index
$i \in \mathcal{Y}_1 = [n]$ and the desired output is $S_1(x, i) = x_i$. Suppose we have already defined the
function $S_{k-1} : \mathcal{X}_{k-1} \times \mathcal{Y}_{k-1} \to \{0,1\}$. In the problem $S_k$, Alice has as input her part
of n independent instances of $S_{k-1}$, i.e., $x \in \mathcal{X}_{k-1}^n$, Bob has his share of n independent
instances of $S_{k-1}$, i.e., $y \in \mathcal{Y}_{k-1}^n$, and in addition, there is an extra input $a \in [n]$ which is
given to Alice if k is even and to Bob if k is odd. The output we seek is the solution to
the a'th instance of $S_{k-1}$. In other words, $S_k(x_1, \ldots, x_n, a, y_1, \ldots, y_n) = S_{k-1}(x_a, y_a)$.

Note that the size of the input to the problem $S_k$ is $N = \Theta(n^k)$. If we allow $k$ message
exchanges for solving the problem, it can be solved by exchanging $\Theta(\log N) = \Theta(k \log n)$
bits: for $k = 1$, Bob sends Alice the index $i$ and Alice then knows the answer; for $k > 1$,
the player with the index $a$ sends it to the other player and then they recursively solve
for $S_{k-1}(x_a, y_a)$. However, we show that if we allow one less message, then no quantum
protocol can compute $S_k$ as efficiently. In fact, no quantum protocol can compute the
function as efficiently even if we allow error, and only require small probability of error on
average.
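
The recursive definition and the $k$-message protocol can be illustrated with a small Python sketch. The nested-tuple encoding of inputs, the 0-based indices, and the names `random_instance`, `evaluate`, and `run_protocol` are assumptions made only for this illustration; the bit count charges $\lceil \log_2 n \rceil$ bits per transmitted index.

```python
import math
import random


def random_instance(k, n, rng):
    """Sample inputs (alice, bob) for S_k.

    S_1: alice is a list of n bits, bob is an index in range(n).
    S_k, k >= 2: each side is a pair (sub_inputs, pointer); the pointer a
    is stored on Alice's side if k is even and on Bob's side if k is odd
    (the other side stores None).
    """
    if k == 1:
        return [rng.randrange(2) for _ in range(n)], rng.randrange(n)
    subs = [random_instance(k - 1, n, rng) for _ in range(n)]
    a = rng.randrange(n)
    alice = ([s[0] for s in subs], a if k % 2 == 0 else None)
    bob = ([s[1] for s in subs], a if k % 2 == 1 else None)
    return alice, bob


def evaluate(k, alice, bob):
    """Direct evaluation of S_k(x, a, y) = S_{k-1}(x_a, y_a)."""
    if k == 1:
        return alice[bob]
    a = alice[1] if k % 2 == 0 else bob[1]
    return evaluate(k - 1, alice[0][a], bob[0][a])


def run_protocol(k, n, alice, bob):
    """Simulate the k-message protocol; return (answer, senders, bits)."""
    bits_per_index = max(1, math.ceil(math.log2(n)))
    senders = []
    while k >= 2:
        holder = "Alice" if k % 2 == 0 else "Bob"
        a = alice[1] if holder == "Alice" else bob[1]
        senders.append(holder)      # the holder of a announces it
        alice, bob = alice[0][a], bob[0][a]
        k -= 1
    senders.append("Bob")           # S_1: Bob sends i, Alice reads x_i
    return alice[bob], senders, len(senders) * bits_per_index


rng = random.Random(0)
alice, bob = random_instance(3, 4, rng)
answer, senders, bits = run_protocol(3, 4, alice, bob)
```

The senders alternate because the holder of the pointer changes with the parity of $k$, and the last message is always Bob's index for the remaining instance of $S_1$.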

Theorem V.1: For all constant $k \ge 1$ and $0 \le \epsilon < 1/2$ we have

$$Q^{k}_{U,\epsilon}(S_{k+1}) = \Omega(N^{1/(k+1)}).$$

To prove this theorem we prove a stronger intermediate claim. Let $P_1$ be Bob, and
for $k \ge 2$, let $P_k$ denote the player that holds the index $a$ in an instance of $S_k$ ($a$ indicates
which of the $n$ instances of $S_{k-1}$ to solve). Let $\bar{P}_k$ denote the other player. We refer to $\bar{P}_k$
as the "wrong" player to start a protocol for $S_k$. The stronger claim is that any $k$ message
protocol for $S_k$ in which the wrong player starts is exponentially inefficient as compared
to the logarithmic-cost protocol described above.

Lemma V.2: For all constant $k \ge 1$ and $0 \le \epsilon < 1/2$ we have $Q^{k,\bar{P}_k}_{U,\epsilon}(S_k) = \Omega(n) = \Omega(N^{1/k})$.

Indeed, there is a classical $k$-message, $O(n)$-bit protocol in which the wrong player starts,
so our lower bound is optimal.

Theorem V.l now follows directly. 

Proof: (Of Theorem V.1): It is enough to show the lower bound for the two cases
when the protocol starts either with $P_{k+1}$ or with the other player.

Let $P_{k+1}$ be the player to start. Note that if we set $a$ to a fixed value, say 1, then we
get an instance of $S_k$. So $Q^{k,P_{k+1}}_{U,\epsilon}(S_k) \le Q^{k,P_{k+1}}_{U,\epsilon}(S_{k+1})$. But $P_{k+1} = \bar{P}_k$, so the bound of
Lemma V.2 applies.

Let player $\bar{P}_{k+1}$ be the one to start. Then, observe that if we allow one more message
(i.e., $k+1$ messages in all), the complexity of the problem only decreases:
$Q^{k+1,\bar{P}_{k+1}}_{U,\epsilon}(S_{k+1}) \le Q^{k,\bar{P}_{k+1}}_{U,\epsilon}(S_{k+1})$. So we again get the bound from Lemma V.2. ■

We prove Lemma V.2 by induction. First, we show that the index function is hard to 
solve with one message if the wrong player starts. This essentially follows from the lower 
bound for random access codes [13], [14]. The only difference is that we seek a lower bound 
for a protocol that has low error probability on average rather than in the worst case, so 
we need a refinement of the original argument. We give this in the next section.

Lemma V.3: For any $0 \le \epsilon \le 1$ we have $Q^{1,A}_{U,\epsilon}(S_1) \ge \frac{1}{2}(1 - H(\epsilon))n$.

Next, we show that if we can solve $S_k$ with $k$ messages with the wrong player starting,
then we can also solve $S_{k-1}$ with only $k-1$ messages of smaller total length, again with
the wrong player starting, at the cost of a slight increase in the average probability of
error.

Lemma V.4: For $k \ge 2$ and $0 \le \epsilon < 1/2$, let $\mathcal{P}$ be any protocol that solves $S_k$ with
respect to the uniform distribution $U$ with error $\epsilon$, and $k$ messages starting with $\bar{P}_k$. Let
the communication complexity of $\mathcal{P}$ be $\ell = \ell_1 + \ell_2$ with $\ell_1$ being the length of the first
message sent. Then, $Q^{k-1,\bar{P}_{k-1}}_{U,\epsilon'}(S_{k-1}) \le \ell_2$, where $\epsilon' = \epsilon + 2(\ell_1/n)^{1/2}$.

We defer the proof of this lemma to a later section, but show how it implies Lemma V.2
above.

Proof: (Of Lemma V.2): We prove the lemma by induction on $k$. The case $k = 1$ is
handled by Lemma V.3. Suppose the statement holds for $k-1$. We prove by contradiction
that it holds for $k$ as well. If $\ell = Q^{k,\bar{P}_k}_{U,\epsilon}(S_k) = o(n)$, then by Lemma V.4 there is a $k-1$
message protocol for $S_{k-1}$ with the wrong player starting, with error $\epsilon' = \epsilon + o(1) < 1/2$,
and with communication complexity at most $\ell = o(n)$. This contradicts the induction
hypothesis. ■

B. The Key Lemmas 

We now prove average case hardness of the index function. 

Proof: (Of Lemma V.3): Consider any protocol for $S_1$ with Alice sending the first
(and only) message. Let $\epsilon_i$ be the probability of error when the input to Alice is uniformly
random but the input to Bob is $i$. Note that $\epsilon = \frac{1}{n}\sum_i \epsilon_i$. Let $X$ denote the random
variable containing Alice's input, and let $M_B$ denote the qubits held by Bob after he
has received Alice's message, including his part of the shared entangled state. From
Properties (1) and (2) of mutual information in Section II-C, and the concavity of binary
entropy,

$$I(X : M_B) \;\ge\; \sum_i I(X_i : M_B) \;\ge\; \sum_i (1 - H(\epsilon_i)) \;\ge\; n(1 - H(\epsilon)).$$

The second inequality follows from the fact that Bob has a measurement that predicts $X_i$
with error $\epsilon_i$ and Fact II.9 (Fano's inequality). On the other hand, $I(X : M_B)$ is bounded
above by twice the number of qubits in the message [15, Theorem 2]. The lemma follows. ■

Note that for public-coin randomized protocols we do not have the factor of 1/2, and
obtain a lower bound of $n(1 - H(\epsilon))$.
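
For a feel of the numbers, the bound of Lemma V.3 and the concavity step in its proof can be evaluated with a few lines of Python. This is only an illustrative sketch: the function names and the `quantum` flag (factor 1/2 for quantum messages, factor 1 for public-coin classical protocols) are hypothetical helpers, not notation from the text.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def index_lower_bound(n, eps, quantum=True):
    """(1/2)(1 - H(eps)) n for quantum messages, (1 - H(eps)) n classically."""
    factor = 0.5 if quantum else 1.0
    return factor * (1.0 - binary_entropy(eps)) * n


def jensen_gap(errors):
    """sum_i (1 - H(eps_i)) - n (1 - H(mean eps)).

    Concavity of H makes this quantity nonnegative, which is the last
    inequality in the displayed chain.
    """
    n = len(errors)
    mean = sum(errors) / n
    total = sum(1.0 - binary_entropy(e) for e in errors)
    return total - n * (1.0 - binary_entropy(mean))
```

The bound degrades smoothly as the allowed average error grows and vanishes at $\epsilon = 1/2$, where a random guess already suffices.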

Next, we show how an efficient protocol for $S_k$ gives rise to an efficient protocol for $S_{k-1}$.
The intuition behind the argument is the same as in proofs for classical communication [10],
[36]. However, we use entirely new techniques from quantum information theory, as
developed in Section III, and also get better bounds.

Proof: (Of Lemma V.4): For concreteness, we assume that $k$ is even, so that $\bar{P}_k$ is
Bob. Let $\mathcal{P}$ be a protocol that solves $S_k$ with respect to the uniform distribution $U$ with
error $\epsilon$, and $k$ messages starting with Bob. Let the communication complexity of $\mathcal{P}$ be
$\ell = \ell_1 + \ell_2$ with $\ell_1$ being the length of the first message sent.
Given the protocol $\mathcal{P}$, we devise a protocol $\mathcal{P}'$ for solving $S_{k-1}$ with respect to the
uniform distribution, but with Alice starting, and with only $k-1$ messages. The intuition
behind the protocol $\mathcal{P}'$ is the following. It first tries to recreate, from some shared prior
entanglement, the state after the first message in the run of $\mathcal{P}$ on a specially chosen $S_k$
instance, and then simulates the remaining $k-1$ rounds of communication of the protocol $\mathcal{P}$
on the recreated state. The instance of $S_k$ is such that the solution to that instance
coincides with the solution to the given $S_{k-1}$ instance. We thus get a protocol for $S_{k-1}$
with the desired properties. The details follow.

We start by describing the joint pure state that Alice and Bob share in $\mathcal{P}'$ prior to
being given the inputs to the problem $S_{k-1}$. Consider the protocol $\mathcal{P}$ computing $S_k$. Let
$M_A$, $M_B$ be the private qubits (or "registers") held by Alice and Bob respectively. Let
$Y = Y_1 Y_2 \cdots Y_n$ denote the register containing the input to Bob. Consider the state $|\chi\rangle$ of
the registers $M_A M_B Y$, after Bob sends the first message in $\mathcal{P}$, when $Y$ is initialized to a
uniform superposition over $\mathcal{Y}_k = \mathcal{Y}_{k-1}^n$. The prior entanglement that Alice and Bob share
in $\mathcal{P}'$ is then defined as

$$\frac{1}{\sqrt{n}} \sum_{j=1}^{n} |j\rangle_A \, |\chi\rangle_{AB} \, |j\rangle_B,$$

where the qubits $M_A$ in $|\chi\rangle$ are given to Alice and $M_B$, $Y$ to Bob. It simplifies the
description of the protocol if Alice and Bob measure the first and the last register, respectively,
of the shared state to get a common random index $j \in [n]$. Since these registers will not
be modified during the course of the protocol, the behavior of $\mathcal{P}'$ is not affected by this
measurement.

We are ready to describe the steps of the protocol $\mathcal{P}'$. Given the inputs $x, y$ to $S_{k-1}$,

1. Alice, who gets the input $x$, initializes a register $X$ to $|\phi\rangle^{\otimes (j-1)} |x\rangle |\phi\rangle^{\otimes (n-j)} |j\rangle$, where $|\phi\rangle$
is the uniform superposition over $\mathcal{X}_{k-1}$.

Note that the state of the registers $X M_A M_B Y$ is now exactly as after the first message in
a run of the protocol $\mathcal{P}$ on an input for $S_k$ where $a = j$, all input registers $X_i$ but for $X_j$
are in uniform superposition over $\mathcal{X}_{k-1}$, $X_j = x$, and all $Y_i$ are in uniform superposition
over $\mathcal{Y}_{k-1}$.

2. Bob, who gets the input $y$, applies a unitary transformation $V_{j,y}$ (to be defined below)
to the registers $M_B Y$. This step is intended to bring the state of the registers $M_A M_B Y$
close to $|\chi(j,y)\rangle$, the state after the first message in a run of the protocol $\mathcal{P}$ on an input
for $S_k$ with $X, Y_1, Y_2, \ldots, Y_{j-1}, Y_{j+1}, \ldots, Y_n$ as above, except that register $Y_j$ is set to $y$
rather than the uniform superposition over $\mathcal{Y}_{k-1}$. Note that on an input as in $|\chi(j,y)\rangle$, the
result of a protocol for $S_k$ is expected to be the same as $S_{k-1}(x, y)$.

3. Alice and Bob now simulate the protocol $\mathcal{P}$ from the second message onwards starting
with the registers $X M_A M_B Y$, and declare the result of that procedure as the output of
the protocol $\mathcal{P}'$.

The transformation $V_{j,y}$ is defined as follows. Consider the state $|\chi(j,y)\rangle$ of the
registers $M_A M_B Y$ (analogous to $|\chi\rangle$) obtained by running $\mathcal{P}$ till the first message is sent, when
the register $Y$ is initialized to $|\psi\rangle^{\otimes (j-1)} |y\rangle |\psi\rangle^{\otimes (n-j)}$, where $|\psi\rangle$ is the uniform superposition
over $\mathcal{Y}_{k-1}$. Let $\rho = \mathrm{Tr}_{M_B Y} |\chi\rangle\langle\chi|$ and $\rho_{j,y} = \mathrm{Tr}_{M_B Y} |\chi(j,y)\rangle\langle\chi(j,y)|$ be the restriction
of the two states to Alice. The transformation $V_{j,y}$ is defined as the local unitary
operator on $M_B Y$, given by Theorem II.7, that achieves the fidelity between $\rho$ and $\rho_{j,y}$. This
completes the description of $\mathcal{P}'$.

Observe that $\mathcal{P}'$ has $k-1$ messages starting with Alice, and has complexity $\ell - \ell_1$. We now
analyze its probability of error, under a uniform distribution on inputs.

Bob's part of the input to $S_k$ in $|\chi\rangle$ and $|\chi(j,y)\rangle$ differ only in the register $Y_j$: in the
first state, this is uniform over $\mathcal{Y}_{k-1}$, whereas in the second state, this is set to $y$. Thus,
the state $|\chi\rangle$ when restricted to Alice is the average encoding, over all $y \in \mathcal{Y}_{k-1}$, of the
state $|\chi(j,y)\rangle$ restricted to her:

$$\rho = \frac{1}{|\mathcal{Y}_{k-1}|} \sum_{y \in \mathcal{Y}_{k-1}} \rho_{j,y}.$$

The Average encoding theorem tells us that $\rho$ and $\rho_{j,y}$ are close to each other on average,
provided the mutual information $\mu_j = I(Y_j : M_A)$ between Alice's state and $Y_j$ in a run
of $\mathcal{P}$ on the uniform distribution on all inputs is small:

$$\mathbb{E}_y \, h^2(\rho, \rho_{j,y}) \;\le\; \frac{\ln 2}{2}\, \mu_j. \qquad (3)$$

As in the proof of Lemma V.3, it is not hard to see that if the length $\ell_1$ of the first
message $M$ is small relative to $n$, then for a random $j$, this mutual information is small.

Claim V.5: $\sum_j \mu_j \le 2\ell_1$. Thus, $\mathbb{E}_j \, \mu_j \le 2\ell_1/n$.

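By Lemma II.8, the transformation $V_{j,y}$ maps $|\chi\rangle$ to a state close to $|\chi(j,y)\rangle$, and by
Lemma II.6

$$\big\| \, |V_{j,y}\chi\rangle\langle V_{j,y}\chi| - |\chi(j,y)\rangle\langle\chi(j,y)| \, \big\|_t
\;\le\; 2\sqrt{2}\; h\big(|V_{j,y}\chi\rangle\langle V_{j,y}\chi|,\, |\chi(j,y)\rangle\langle\chi(j,y)|\big)
\;=\; 2\sqrt{2}\; h(\rho, \rho_{j,y}). \qquad (4)$$

For a random $y \in \mathcal{Y}_{k-1}$, and a random $j \in [n]$, then, the average error in approximating
the state $|\chi(j,y)\rangle$ is

$$\begin{aligned}
\mathbb{E}_{j,y} \big\| \, |V_{j,y}\chi\rangle\langle V_{j,y}\chi| - |\chi(j,y)\rangle\langle\chi(j,y)| \, \big\|_t
&\le 2\sqrt{2}\; \mathbb{E}_{j,y}\, h(\rho, \rho_{j,y}) && \text{From equation (4)} \\
&\le 2\sqrt{2}\; \mathbb{E}_j \big[ \mathbb{E}_y\, h^2(\rho, \rho_{j,y}) \big]^{1/2} && \text{By Jensen's inequality} \\
&\le 2\sqrt{2}\; \mathbb{E}_j \Big[ \tfrac{\ln 2}{2}\, \mu_j \Big]^{1/2} && \text{From equation (3)} \\
&\le 2\sqrt{\ln 2}\; \big[ \mathbb{E}_j\, \mu_j \big]^{1/2} && \text{By Jensen's inequality} \\
&\le 3\, (\ell_1/n)^{1/2} && \text{From Claim V.5}
\end{aligned}$$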

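Running the protocol $\mathcal{P}$ on the input described in step 2 of $\mathcal{P}'$ finds $S_{k-1}(x,y)$ with
probability of error at most $\epsilon$ on average when $x, y$ are chosen at random. Thus, running
the protocol $\mathcal{P}$ on the state resulting from step 2 of the protocol $\mathcal{P}'$ gives us the answer
to $S_{k-1}(x,y)$ with average probability of error only slightly higher than $\epsilon$, namely at most

$$\epsilon + \frac{1}{2}\, \mathbb{E}_{j,y} \big\| \, |V_{j,y}\chi\rangle\langle V_{j,y}\chi| - |\chi(j,y)\rangle\langle\chi(j,y)| \, \big\|_t \;\le\; \epsilon + 2(\ell_1/n)^{1/2} = \epsilon',$$

as claimed. ■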
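
The constants in the chain of inequalities can be checked numerically. The sketch below assumes that the Average encoding theorem is applied with the constant $(\ln 2)/2$, as suggested by the step from $2\sqrt{2}$ to $2\sqrt{\ln 2}$, and that the error increases by at most half the trace distance; the function names are hypothetical.

```python
import math


def state_error_bound(l1, n):
    """Bound on the average trace distance between the recreated state and
    the true post-first-message state, for first-message length l1."""
    avg_mu = 2.0 * l1 / n                           # Claim V.5
    avg_h = math.sqrt(math.log(2) / 2.0 * avg_mu)   # equation (3) and Jensen
    return 2.0 * math.sqrt(2.0) * avg_h             # equation (4)


def new_error(eps, l1, n):
    """Average error of the (k-1)-message protocol built from the k-message one."""
    return eps + 0.5 * state_error_bound(l1, n)
```

Since $2\sqrt{2 \ln 2} \approx 2.35$, the chain leaves some slack in both the constant 3 and the final constant 2.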

For classical randomized protocols, it is possible to simplify the reduction of $S_{k-1}$ to $S_k$
described above. This is accomplished as follows. Recall that Alice and Bob share public
random coins. They use these to sample a (common) message $m$ from the distribution over
classical messages in the first round of the protocol $\mathcal{P}$ for $S_k$, where the inputs are chosen
uniformly at random. They also pick a common random index $j \in [n]$. Alice now picks
$X_1, X_2, \ldots, X_n$ uniformly at random from $\mathcal{X}_{k-1}$, and sets $X_j = x$ and $a = j$. Bob picks $Y_1, \ldots, Y_n$
from the uniform distribution over $\mathcal{Y}_{k-1}^{j-1} \times \{y\} \times \mathcal{Y}_{k-1}^{n-j}$, conditioned on the first message in
the protocol $\mathcal{P}$ on such a random input being equal to $m$. The joint state so constructed
differs from the joint state in the original protocol (in $\ell_1$-distance) by
at most the distance between Alice's marginal distributions. Alice and Bob now simulate
the protocol $\mathcal{P}$ from the second message onwards on the input $X, Y$. A straightforward
analysis using the Average encoding theorem shows that the initial state (consisting of
the message and the inputs) constructed above differs from the corresponding state in
the protocol $\mathcal{P}$ by only $(2\ell_1/n)^{1/2}$, where here $\ell_1$ is the length of the first message.
This simpler argument was noted in Ref. [37] and independently in Ref. [38].

C. The Disjointness Problem 

We now investigate the bounded round complexity of the disjointness problem. Here 
Alice and Bob each receive the incidence vector of a subset of a size $n$ universe. They
reject iff the sets are disjoint. It is known [39], [12] that Q^(DISJ) > (1 — H{e))n and 
$Q^{1}_{\epsilon}(\mathrm{DISJ}) \ge (1 - H(\epsilon))n/2$. Furthermore $Q^{O(\sqrt{n})}_{1/3}(\mathrm{DISJ}) = O(\sqrt{n} \log n)$ by an application
of Grover search [1]. This upper bound was later improved [20] to $O(\sqrt{n})$, although the
number of rounds remained $O(\sqrt{n})$. We now prove a lower bound by reduction.

Proof: (Of Corollary I.3): Suppose we are given a $k$ round quantum protocol for the
disjointness problem having error 1/3 and using $c$ qubits. W.l.o.g. we can assume Bob
starts the communication, because the problem is symmetrical, and that $k$ is even. We
reduce the communication problem $S_k$ from Section V-A to DISJ.

We visualize an instance of $S_k$ as defining a subtree of the $n$-ary tree with $k+1$ levels
and the edges at alternate levels known to Alice and Bob, respectively. The leaves of the
tree are labelled by Boolean values known to Alice (since $k$ is even). The only edge at the
root connects it to the $a$'th child, where $a \in [n]$ is the input that specifies which instance
of $S_{k-1}$ is to be solved. The subtrees at the second level are defined recursively according
to the $n$ instances of $S_{k-1}$.

There are at most $n^k$ possible paths of length $k$ that could start at the root vertex.
With each such path we associate an element in the universe for the disjointness problem.
Given the edges originating from each of their levels, Alice and Bob construct an instance
of DISJ on a universe of size $N = n^k$. Alice checks for each possible path of length $k$
whether the path is consistent with her input and whether the path leads to a leaf which
corresponds to the bit 1. In this case she takes the corresponding element of the universe
into her subset. Bob similarly constructs his subset. Now, if the two subsets intersect,
then the (unique) element in the intersection witnesses a length $k$ path leading to a 1-leaf.
If the subsets do not intersect, then the length $k$ path from the root leads to a 0-leaf.

We thus obtain a $k$ round protocol for $S_k$ in which Bob starts. By Lemma V.2, the
communication $c$ is $\Omega(n)$ for any constant $k$. Since the input length for the constructed
instance of DISJ is $N = n^k$ we get $Q^{k}_{1/3}(\mathrm{DISJ}) = \Omega(N^{1/k})$ for $k = O(1)$. ■
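
The reduction in the proof above can be sketched in Python by enumerating all $n^k$ root-to-leaf paths. The input encoding (nested tuples, 0-based pointers) and the names `random_instance`, `evaluate`, `consistent_paths`, and `disj_instance` are assumptions made for this illustration; the code works for either parity of $k$, although the proof fixes $k$ even. Bob keeps every path that agrees with all pointers he holds, and Alice keeps every path that agrees with her pointers and ends at a leaf labelled 1.

```python
import itertools
import random


def random_instance(k, n, rng):
    """Sample (alice, bob) for S_k; S_1 is (bit list, index), and for k >= 2
    each side is (sub_inputs, pointer), with the pointer held by Alice if k
    is even and by Bob if k is odd (None on the other side)."""
    if k == 1:
        return [rng.randrange(2) for _ in range(n)], rng.randrange(n)
    subs = [random_instance(k - 1, n, rng) for _ in range(n)]
    a = rng.randrange(n)
    alice = ([s[0] for s in subs], a if k % 2 == 0 else None)
    bob = ([s[1] for s in subs], a if k % 2 == 1 else None)
    return alice, bob


def evaluate(k, alice, bob):
    """S_k(x, a, y) = S_{k-1}(x_a, y_a), with S_1(x, i) = x_i."""
    if k == 1:
        return alice[bob]
    a = alice[1] if k % 2 == 0 else bob[1]
    return evaluate(k - 1, alice[0][a], bob[0][a])


def consistent_paths(k, n, side, owns_level, leaf_ok):
    """Root-to-leaf paths (one universe element each) kept by one player."""
    chosen = set()
    for path in itertools.product(range(n), repeat=k):
        node, ok = side, True
        for level, child in zip(range(k, 1, -1), path):
            if owns_level(level) and node[1] != child:
                ok = False          # contradicts a pointer this player holds
                break
            node = node[0][child]
        if ok and leaf_ok(node, path[-1]):
            chosen.add(path)
    return chosen


def disj_instance(k, n, alice, bob):
    """Return (Alice's subset, Bob's subset) of the n**k-element universe."""
    a_set = consistent_paths(k, n, alice, lambda lv: lv % 2 == 0,
                             lambda bits, i: bits[i] == 1)
    b_set = consistent_paths(k, n, bob, lambda lv: lv % 2 == 1,
                             lambda index, i: index == i)
    return a_set, b_set
```

The two subsets intersect exactly when the true path ends in a 1-leaf, so any protocol for disjointness on a universe of size $n^k$ yields a protocol for $S_k$ with the same number of rounds.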

D. Beyond A Constant Number Of Messages 

So far, we have discussed the complexity of solving $S_k$ in the context of protocols with
a constant number of messages. In fact, we may derive a meaningful lower bound even
when $k$ grows as a function of the parameter $n$ (hence as a function of $N = n^k$, the input
length). We may state the result as follows.

Theorem V.6: For all $k = k(n) \ge 1$ and constant $\epsilon < 1/2$ we have
$Q^{k,\bar{P}_k}_{U,\epsilon}(S_k) = \Omega(n/k + k)$.

Proof: Let $\ell = Q^{k,\bar{P}_k}_{U,\epsilon}(S_k)$. Then, there is a protocol that achieves this communication
complexity with $\ell_1, \ell_2, \ldots, \ell_k$ qubits of communication in the $k$ rounds, respectively. By
repeated application of Lemma V.4 there is a quantum protocol that solves $S_1$ with one
message, the wrong player starting, $\ell_k$ communication qubits and error

$$\epsilon_1 \;=\; \epsilon + 2 \sum_{i=1}^{k-1} \Big(\frac{\ell_i}{n}\Big)^{1/2}
\;\le\; \epsilon + 2 \Big(\frac{(k-1)\sum_{i<k} \ell_i}{n}\Big)^{1/2}
\;\le\; \epsilon + 2 \Big(\frac{k\ell}{n}\Big)^{1/2},$$

where the first inequality is by Jensen's inequality.

For a constant $\delta \in (\epsilon, 1/2)$, if $\ell \le \big(\frac{\delta - \epsilon}{2}\big)^2 \frac{n}{k}$ then $\epsilon_1 \le \delta$ and by Lemma V.3 we have
$\ell \ge \ell_k \ge \frac{1}{2}(1 - H(\delta))\, n$. This implies that $k \le \big(\frac{\delta - \epsilon}{2}\big)^2 \cdot 2 \cdot \frac{1}{1 - H(\delta)}$. For some $\delta$ close enough to $\epsilon$
we get $k < 1$. A contradiction. This proves that $\ell \ge \Omega(n/k)$. Also, every $k$ round protocol
has at least $k$ communication qubits and so $\ell \ge k$. ■

Note that this lower bound of $\Omega(n/k + k)$ also applies to classical randomized protocols.
The above theorem implies a gap in communication complexity between $k$ and $k+1$
message protocols for $k$ up to $\Theta((n/\log n)^{1/2}) = \Theta(\log N / \log\log N)$, and also lower bounds
for DISJ for such $k$.



VI. The Pointer Jumping Function

The pointer jumping function is considered in most results showing a round-hierarchy
for classical communication complexity [6], [7], [9], [8]. This problem is a particularly
natural candidate for such results.

Definition VI.1 (Pointer Jumping) Let $V_A$ and $V_B$ be disjoint sets of $n$ vertices each.
Let $\mathcal{F}_A = \{f_A \mid f_A : V_A \to V_B\}$, and $\mathcal{F}_B = \{f_B \mid f_B : V_B \to V_A\}$, and

$$f(v) = f_{f_A, f_B}(v) = \begin{cases} f_A(v) & \text{if } v \in V_A, \\ f_B(v) & \text{if } v \in V_B. \end{cases}$$

Define $f^{(0)}(v) = v$ and $f^{(k)}(v) = f(f^{(k-1)}(v))$.

Then $g_k : \mathcal{F}_A \times \mathcal{F}_B \to (V_A \cup V_B)$ is defined by $g_k(f_A, f_B) = f^{(k+1)}_{f_A, f_B}(v_1)$, where $v_1 \in V_A$ is
fixed. The pointer jumping function $f_k : \mathcal{F}_A \times \mathcal{F}_B \to \{0, 1\}$ is the XOR of all the bits in
the output of $g_k$.

In the corresponding communication problem, Alice is given a function $f_A \in \mathcal{F}_A$, and Bob
a function $f_B \in \mathcal{F}_B$, and they are required to compute $f_k(f_A, f_B)$.

A. Previous Work

If Alice starts, $f_k$ has a deterministic $k$ round communication complexity of $k \log n$. If
Bob starts, Nisan and Wigderson [7] proved that $f_k$ has a randomized $k$ round
communication complexity of $\Omega(n/k^2 - k \log n)$. The lower bound can also be improved to $\Omega(n/k + k)$,
see Klauck [39]. With techniques similar to the ones in this section it is also possible to
show a lower bound of $\frac{(1 - 2\epsilon)^2}{2k^2}\, n - k \log n$ for the randomized $k$ round complexity of $f_k$ when
Bob starts. We omit the details.

The lower bounds are not far from the known upper bound. Nisan and Wigderson [7]
describe a randomized protocol for computing $g_k$ with complexity $O(\frac{n}{k} \log n + k \log n)$ in
the situation where Bob starts and $k$ rounds are allowed. Ponzio et al. [9] show that when
$k = O(1)$, the deterministic communication complexity of $f_k$ is $O(n)$.
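
Definition VI.1 can be made concrete with a short Python sketch. The representation of vertices as pairs `(side, index)`, the list encoding of $f_A$ and $f_B$, and the function names are assumptions made for this illustration only. The number of jumps is left as an explicit parameter `steps`, so the sketch does not commit to a particular iteration count in the definition of $g_k$.

```python
def pointer_jump(f_a, f_b, start, steps):
    """Return f^(steps)(start), where f acts as f_a on V_A and f_b on V_B.

    Vertices are pairs (side, index) with side in {"A", "B"}; f_a[i] is the
    index in V_B that vertex i of V_A points to, and symmetrically for f_b.
    """
    side, index = start
    for _ in range(steps):
        if side == "A":
            side, index = "B", f_a[index]
        else:
            side, index = "A", f_b[index]
    return side, index


def parity_of_name(index):
    """XOR of all the bits in the binary name of a vertex."""
    return bin(index).count("1") % 2


def pointer_jump_bit(f_a, f_b, steps, start=("A", 0)):
    """The Boolean version: parity of the name of the vertex that is reached."""
    return parity_of_name(pointer_jump(f_a, f_b, start, steps)[1])
```

When the players alternate correctly, each jump costs one message of $\log n$ bits, which is the source of the $k \log n$ upper bound quoted above.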
B. A New Upper Bound

We first give a new classical upper bound which combines ideas from Nisan and
Wigderson [7] and Ponzio et al. [9]. For $n \ge 1$, define $\log^{(1)}(n) = \log n$ and for $k > 1$, define

$$\log^{(k)}(n) = \log\big(\max\{\log^{(k-1)}(n), 1\}\big).$$

Furthermore let $\log^*(n) = \min\{k : \log^{(k)}(n) \le 1\}$.
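
The iterated logarithm is easy to compute directly. The following sketch assumes logarithms to base 2, takes $\log^{(1)}(n) = \log n$ as the base case, and uses the threshold $\log^{(k)}(n) \le 1$; the function names are hypothetical.

```python
import math


def iterated_log(k, n):
    """log^(k)(n): apply log2 k times, clamping intermediate values at 1."""
    value = math.log2(n)
    for _ in range(k - 1):
        value = math.log2(max(value, 1.0))
    return value


def log_star(n):
    """Smallest k >= 1 such that log^(k)(n) <= 1."""
    k = 1
    while iterated_log(k, n) > 1.0:
        k += 1
    return k
```

The clamp inside the maximum keeps the argument of the logarithm at least 1, so the iterates never become negative and the loop in `log_star` always terminates.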

Theorem VI.1: $R^{k,B}_{\epsilon}(g_k) \le O\big(k \log n + \frac{n}{k} \cdot \log\frac{1}{\epsilon} \cdot (\log^{(\lceil k/2 \rceil)}(n) + \log k)\big)$.

Proof: The claim is trivial for $k = 1$.

For greater $k$ Bob starts and we have the following protocol. At the first round Bob
guesses (with public random bits) a set $S_0$ of $\delta n$ random vertices from $V_B$; we specify $\delta$ later.
For each chosen vertex $v$ Bob communicates the first $\ell_0$ bits of $f_B(v)$; we specify $\ell_0$ later.
Note that the names of the chosen vertices are accessible to Alice without communication,
by reading the public random bits. The protocol then proceeds in two stages.

• Denote $v_t = f^{(t-1)}(v_1)$. For each round $i = 1, \ldots, t$ the active player sends $v_i$. I.e.,
at the first round Bob sends nothing (as $v_1$ is known), at the second round Alice sends
$v_2 = f(v_1)$, then Bob sends $f(v_2)$ and so on. Also, at each round $i$ Alice checks whether
$v_i \in S_0$. Let $t$ be the first round in which this happens. If $t > k/2$ the two players abort the
protocol.

• The rounds $t, t+1, \ldots, k$ take a special form. Let us start with round $t$. Alice knows
$v_t \in S_0$ and therefore knows the first $\ell_0$ bits of $f_B(v_t)$. Alice defines a set $S_1$ that contains
all elements of $V_A$ with that prefix. I.e., $|S_1| \le n/2^{\ell_0}$ and $v_{t+1} = f(v_t) \in S_1$. For each $v \in S_1$
Alice sends the first $\ell_1$ bits of $f_A(v)$. In general, in the $(t+i)$'th round the active player
knows $\ell_i$ bits of $f(v_{t+i})$. The active player then defines a set $S_{i+1}$ that contains all the
elements of his side with that prefix. I.e., $|S_{i+1}| \le n/2^{\ell_i}$ and $v_{t+i+1} = f(v_{t+i}) \in S_{i+1}$. For
each $v \in S_{i+1}$ the active player sends the first $\ell_{i+1}$ bits of $f(v)$.

We now specify the parameters. First we choose $\delta = \frac{2}{k} \ln\frac{1}{\epsilon}$. W.l.o.g. we can assume the
vertices $v_2, v_4, \ldots$ are all distinct, or Alice can easily save two rounds and the players finish
on time. For any choice of $k/2$ distinct vertices $v_2, \ldots, v_{k/2}$ the probability, over the choice of
$S_0$, that during the first $k/2$ rounds Alice will not visit $S_0$ is at most $(1 - \frac{k}{2n})^{\delta n} \le e^{-\delta k/2} \le \epsilon$.
So assume indeed that $t \le k/2$.

We now choose $\ell_i = \log^{(\lceil k/2 \rceil - i)} n + 3 \log k$. It follows that for some $i \le k/2$ we have
$\ell_i \ge \log n$ and $|S_i| = 1$ and the active player who holds $v_{t+i}$ also knows $f(v_{t+i})$, so he can
save two rounds and the computation ends on time.

We now count the number of communication bits. We need $k \log n$ bits for
communicating $v_i$, $i = 1, \ldots, k$. Also, we need $\sum_{i=0}^{\lceil k/2 \rceil} |S_i| \ell_i$ bits for communicating the first $\ell_i$ bits
of each element in $S_i$. Notice, however, that $\ell_i / 2^{\ell_{i-1}} = O(1/k^2)$, and so:

$$\sum_{i=0}^{\lceil k/2 \rceil} |S_i|\, \ell_i
\;\le\; n \Big[ \delta \ell_0 + \sum_{i=1}^{\lceil k/2 \rceil} \frac{\ell_i}{2^{\ell_{i-1}}} \Big]
\;\le\; n \Big[ \delta \ell_0 + O\Big(\frac{1}{k^2}\Big) \sum_{i=1}^{\lceil k/2 \rceil} 1 \Big]
\;=\; O\Big( \frac{n}{k} \cdot \log\frac{1}{\epsilon} \cdot \big(\log^{(\lceil k/2 \rceil)} n + \log k\big) \Big),$$

which completes the proof. ■

Corollary VI.2: If $k \ge 2 \log^*(n)$ then $R^{k,B}_{\epsilon}(g_k) \le O\big((\frac{n}{k} + k) \log k\big)$.

C. A Lower Bound On The Quantum Communication Complexity 

In this section we prove a lower bound on the quantum communication complexity of
the pointer jumping function $f_k$, for the situation that $k$ rounds are allowed and Bob
sends the first message. The proof uses the same ingredients as the proof of the lower
bound for the function $S_k$ in Theorem V.1, namely the Average Encoding Theorem and
the Local Transition Lemma. We will consider a quantity $d_t$ capturing the information
the active player has in round $t$ on vertex $t+1$ of the path. This quantity will be the
informational distance between the active player's qubits and vertex $t+1$. Our goal will
be to bound $d_t$ in terms of $d_{t-1}$ (which is the information gain so far) plus a term related
to the average information on pointers in the other player's input (which is low as long as
the number of qubits sent is small). This leads to a recursion imposing a lower bound on
the communication complexity, since in the end the protocol must have reasonably large
information to produce the output, and in the beginning the corresponding information $d_0$
is 0.

Let Alice be active in the $(t+1)$'th round. The informational distance $d_{t+1}$ measures the
distance between the state of, say, Alice's qubits together with the next vertex $F_B(V_{t+1})$
of the path, and the tensor product of the states of Alice's qubits and $F_B(V_{t+1})$. In
the product state Alice has no information about $F_B(V_{t+1})$, so if the two states are close
Alice's powers to say something about the vertex are very limited. We will use the triangle
inequality to bound $d_{t+1}$ by the sum of three intermediate distances. In the first step we
move from the state given by the protocol to a state in which the $(t+1)$'th vertex is
replaced by a uniformly random vertex, independent of previous communications. The
penalty we have to pay for that is proportional to $d_t$, which is a bound on the amount
of information Bob gained on $V_{t+1}$. We use the local transition lemma to conceal Bob's
ability to detect such a replacement. Once the $(t+1)$'th vertex is random, we deal with
the average information a player (Bob) can get on a random pointer in the other player's
input, and this term is small when the number of communicated qubits is small. The last
step is similar to the first and reverses the first one's effect, i.e., replaces the "randomized"
$(t+1)$'th vertex by its real value again. We arrive at the desired product state.

Theorem VI.3: $Q^{k,B}_{1/3}(f_k) \ge \frac{n}{2^{O(k)}} - k \log n$.

Note that the lower bound is linear in $n$ for constant $k$ and leads to Theorem I.2. It
implies a separation between the $k$ and $k+1$ round complexity of Pointer Jumping for $k$
up to $\Theta(\log n) = \Theta(\log N)$, where $N = n \log n$ is the input size.

Proof: (Of Theorem VI.3): Fix a quantum protocol for $f_k$ with probability of error 1/3,
$k$ rounds, and with Bob starting. Usually a protocol gets some classical $f_A$ and $f_B$ as
inputs, but we will investigate what happens if the protocol is started on a superposition
over all inputs, in which all inputs have the same amplitude, i.e., on

$$\sum_{f_A \in \mathcal{F}_A,\, f_B \in \mathcal{F}_B} \frac{1}{n^n}\, |f_A\rangle_{F_A} |f_B\rangle_{F_B}.$$

Note that $|\mathcal{F}_A| = |\mathcal{F}_B| = n^n$. The superposition over all inputs is measured after the
protocol has finished, so that a uniformly random input and the result of the protocol on
that input are produced.

We also require that before round $t$ the active player computes and measures the vertex
$V_t = f^{(t-1)}(v_1)$, and includes it in the message that is sent to the other player, who stores
it in some qubits $V_t$. Thus, at the first round Bob sends $v_1$ (which is known in advance)
to Alice, at the second round Alice sends $v_2 = F_A(v_1)$ to Bob and so on. This increases
the communication by an additive $k \log n$ term. Notice that $F_A, F_B$ are in a uniform
superposition over all possible inputs, and so if we don't measure $F_A$ and $F_B$ the register
$V_i$ is also in a uniform superposition for every $i > 1$. The density matrix of the global state
of the protocol before the communication of round $t$ is $\rho_{M_{A,t} M_{B,t} F_A F_B}$, where $F_A, F_B$ are the
qubits holding the inputs of Alice and Bob and $M_{A,t}$ resp. $M_{B,t}$ are the other qubits in the
possession of Alice and Bob before the communication of round $t$. The state of the latter
two systems of qubits may be entangled. In the beginning these qubits are independent
of the input. We also denote by $\tilde{\rho}_{M_{A,t} M_{B,t} F_A F_B}$ the density matrix of the system in the case
where we do not measure any of the $V_i$.

Let us denote $d_t = D^2(M_{B,t} F_B : F_A(V_t))$ when $t$ is odd, where the register $F_A(V_t)$
has been measured. Notice that at this stage $V_t$ is measured and $F_A(V_t)$ is a subregister
of $F_A$. The quantity $d_t$ is a measure of Bob's information on the value $F_A(v_t)$ Alice is
going to compute. We similarly let $d_t = D^2(M_{A,t} F_A : F_B(V_t))$ when $t$ is even, where the
register $F_B(V_t)$ has been measured.

We assume that the communication complexity of the protocol is $\delta n$ and prove a lower
bound $\delta \ge 2^{-O(k)}$. The general strategy of the proof is induction over the rounds, to
successively bound $d_1, d_2, \ldots, d_{k+1}$. Bob sends the first message. As Bob has seen no
message yet, we have that $I(M_{B,1} F_B : F_A(V_1)) = 0$, and hence $d_1 = 0$. We show that

Lemma VI.4: $d_{t+1} \le 8 d_t + 4\delta$.

We see that $d_{t+1} \le 9^t \delta$ for all $t \ge 0$. After round $k$ one player, say Alice, announces
the result which is supposed to be the parity of $F_B(V_{k+1})$ and included in $M_{A,k+1}$. On the
one hand $d_{k+1} = D^2(M_{A,k+1} F_A : F_B(V_{k+1})) \le 9^k \delta$. On the other hand, by Lemma III.4(2),
$D^2(M_{A,k+1} : F_B(V_{k+1})) \ge 1/8 - 1/16 = 1/16$. Since discarding $F_A$ cannot increase the
informational distance, together we get $1/16 \le 9^k \delta$, so $\delta \ge 2^{-O(k)}$. ■
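
The recursion of Lemma VI.4 can be unrolled with exact rational arithmetic to see where the $2^{-O(k)}$ bound comes from. The sketch below takes $d_1 = 0$ and the threshold $1/16$ from the argument above; the function names are hypothetical.

```python
from fractions import Fraction


def distance_bounds(k, delta):
    """Iterate d_{t+1} = 8 d_t + 4 delta from d_1 = 0.

    Returns the list [d_1, d_2, ..., d_{k+1}] of worst-case values.
    """
    d = [Fraction(0)]
    for _ in range(k):
        d.append(8 * d[-1] + 4 * Fraction(delta))
    return d


def min_delta(k):
    """Smallest delta compatible with 1/16 <= 9^k * delta."""
    return Fraction(1, 16 * 9 ** k)
```

Because $\delta \ge 1/(16 \cdot 9^k)$, the communication $\delta n$ is at least $n/2^{O(k)}$, and subtracting the $k \log n$ bits added for sending the vertices gives the form of the bound in Theorem VI.3.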

We now turn to proving Lemma VI. 4. 

Proof: (Of Lemma VI.4): W.l.o.g. let Alice be active in round $t+1$. Let $M_A = M_{A,t+1}$
and $M_B = M_{B,t+1}$. Before the $(t+1)$'th round $V_{t+1} = F_A(V_t)$ is measured. The resulting state is
a probabilistic ensemble over the possibilities to fix $V_1, \ldots, V_{t+1}$, which are then classically
distributed. Alice's reduced state is block diagonal with respect to the possible values of
the vertices $V_1, \ldots, V_{t+1}$. For any value $v$ of $V_{t+1}$ let $\rho^{v}_{M_A M_B F_A F_B} = \rho^{V_{t+1} = v}_{M_A M_B F_A F_B}$ denote the
pure state with vertex $V_{t+1}$ fixed to $v$. Our first goal is to bound the amount of information
Bob has at this stage about Alice's value $V_{t+1}$. We define:

$$\gamma_v = h^2\big(\rho^{v}_{M_B F_B},\, \rho_{M_B F_B}\big).$$

I.e., we look at Bob's view before the $(t+1)$'th message, and in particular before Alice sends
$V_{t+1}$ to him, and we let $\gamma_v$ measure how much Bob's view when $V_{t+1} = v$ differs from Bob's
average view. We show that these two are typically close to each other, namely:

Lemma VI.5: $\mathbb{E}_v\, \gamma_v \le d_t$.

Loosely speaking this says that Bob does not know more than $d_t$ units of information
about $F_A(V_t)$.

The next step is to replace the actual state $\rho^{v}_{M_A M_B F_A F_B}$ where $V_{t+1} = v$ with the average
case $\rho_{M_A M_B F_A F_B}$ where nothing is known about $V_{t+1}$. As we saw, typically, Bob cannot
distinguish between the actual encoding and the average one, so this should not matter
much to Bob. We let $\rho^{v}_{M_A M_B F_A F_B R}$ be a purification of $\rho^{v}_{M_A M_B F_A F_B}$ where $R$ is some
additional space used to purify the random path $V_1, \ldots, V_t$. I.e., $\rho^{v}_{M_A M_B F_A F_B R}$ reflects a
purification of Bob's view, when $V_{t+1} = v$. We let $\rho_{M_A M_B F_A F_B R}$ be a purification of $\rho_{M_A M_B F_A F_B}$
where $R$ is some additional space used to purify the random path $V_1, \ldots, V_{t+1}$. Now,
due to Lemma II.8 there is a local unitary transformation $U_v$ acting only on $F_A M_A R$
such that $\sigma^{v}_{M_A M_B F_A F_B R} = U_v\, \rho_{M_A M_B F_A F_B R}\, U_v^{\dagger}$ and $\rho^{v}_{M_A M_B F_A F_B R}$ are close to each other.
$\sigma^{v}_{M_A M_B F_A F_B R}$ reflects a purification of Bob's average view with Alice locally adding $V_{t+1} = v$
to it. Notice that in $\sigma^{v}_{M_A M_B F_A F_B R}$, $F_A(V_t)$ is arbitrary and in particular can be different than
$V_{t+1}$. By Lemma II.8 for all vertices $v \in V_B$,

$$\begin{aligned}
h\big(\rho^{v}_{M_A F_A},\, \sigma^{v}_{M_A F_A}\big)
&\le h\big(\rho^{v}_{M_A F_A F_B(v)},\, \sigma^{v}_{M_A F_A F_B(v)}\big) \\
&\le h\big(\rho^{v}_{M_A M_B F_A F_B R},\, \sigma^{v}_{M_A M_B F_A F_B R}\big) \\
&= h\big(\rho^{v}_{M_B F_B},\, \rho_{M_B F_B}\big) = \sqrt{\gamma_v}. \qquad (5)
\end{aligned}$$

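We are interested in the value

$$d_{t+1} = D^2(M_A F_A : F_B(V_{t+1})) = \mathbb{E}_v\, h^2\big(\rho^{v}_{M_A F_A F_B(v)},\, \rho^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big),$$

where $F_B(v)$ is measured and the expectation is over the uniform distribution on
vertices $v$. We now study this expression under the average case scenario, i.e., we look at
$h^2\big(\sigma^{v}_{M_A F_A F_B(v)},\, \sigma^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big)$. We prove:

Lemma VI.6: For all vertices $v \in V_B$,

$$h^2\big(\sigma^{v}_{M_A F_A F_B(v)},\, \sigma^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big) \le d_{t+1}(v),$$

where,

$$d_{t+1}(v) = h^2\big(\tilde{\rho}_{M_A F_A F_B(v)},\, \tilde{\rho}_{M_A F_A} \otimes \tilde{\rho}_{F_B(v)}\big), \qquad (6)$$

where $F_B(v)$ is assumed to have been measured. Recall that in $\tilde{\rho}$ we let $V_1, \ldots, V_{t+1}$ go
unmeasured and that $v$ is an arbitrary value not necessarily equal to $V_{t+1}$. We then prove:

Lemma VI.7: $\mathbb{E}_v\, d_{t+1}(v) \le 2\delta$.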

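Assuming the above lemma, we see that for all $v$:

$$\begin{aligned}
h\big(\rho^{v}_{M_A F_A F_B(v)},\, \rho^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big)
&\le h\big(\rho^{v}_{M_A F_A F_B(v)},\, \sigma^{v}_{M_A F_A F_B(v)}\big) \\
&\quad + h\big(\sigma^{v}_{M_A F_A F_B(v)},\, \sigma^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big) \\
&\quad + h\big(\sigma^{v}_{M_A F_A} \otimes \rho_{F_B(v)},\, \rho^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big) \\
&\le 2\sqrt{\gamma_v} + h\big(\sigma^{v}_{M_A F_A F_B(v)},\, \sigma^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big) && \text{From equation (5)} \\
&\le 2\sqrt{\gamma_v} + \sqrt{d_{t+1}(v)} && \text{From Lemma VI.6.}
\end{aligned}$$

Squaring both sides,

$$h^2\big(\rho^{v}_{M_A F_A F_B(v)},\, \rho^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big) \le \Big(2\sqrt{\gamma_v} + \sqrt{d_{t+1}(v)}\Big)^2 \le 8\gamma_v + 2 d_{t+1}(v). \qquad (7)$$

I.e., we paid an $8\gamma_v$ penalty, and we switched to the scenario where Bob has no information
about $V_{t+1}$. Now,

$$\begin{aligned}
D^2(M_A F_A : F_B(V_{t+1})) &= \mathbb{E}_v\, h^2\big(\rho^{v}_{M_A F_A F_B(v)},\, \rho^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big) \\
&\le \mathbb{E}_v \big[ 8\gamma_v + 2 d_{t+1}(v) \big] && \text{By equation (7)} \\
&\le 8 d_t + 4\delta && \text{By Lemmas VI.5 and VI.7.}
\end{aligned}$$

This completes the proof of Lemma VI.4. ■

We finish the proof of Theorem VI.3 by proving the remaining Lemmas.

Proof: (Of Lemma VI.5): By definition $\mathbb{E}_v\, \gamma_v$ is $h^2\big(\rho_{M_B F_B F_A(V_t)},\, \rho_{M_B F_B} \otimes \rho_{F_A(V_t)}\big) =
D^2(M_B F_B : F_A(V_t))$. Now, $D^2(M_{B,t+1} F_B : F_A(V_t)) \le D^2(M_{B,t} F_B : F_A(V_t)) = d_t$ because
Bob sends the $t$'th message, and this only decreases the informational distance. ■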

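Proof: (Of Lemma VI.6):

$$\begin{aligned}
h^2\big(\sigma^{v}_{M_A F_A F_B(v)},\, \sigma^{v}_{M_A F_A} \otimes \rho_{F_B(v)}\big)
&\le h^2\big(\sigma^{v}_{M_A F_A R F_B(v)},\, \sigma^{v}_{M_A F_A R} \otimes \rho_{F_B(v)}\big) \\
&= h^2\big(\rho_{M_A F_A R F_B(v)},\, \rho_{M_A F_A R} \otimes \rho_{F_B(v)}\big) && \text{By unitarity} \\
&= h^2\big(\tilde{\rho}_{M_A F_A F_B(v)},\, \tilde{\rho}_{M_A F_A} \otimes \tilde{\rho}_{F_B(v)}\big) && (8) \\
&= d_{t+1}(v) && \text{By definition (6).}
\end{aligned}$$

For equation (8), notice that $R$ holds the path $V_1, \ldots, V_{t+1}$, which is determined by $M_A F_A$.
We can apply a unitary transformation that "erases" this. We then get a pure state that
is $\rho$ with $V_1, \ldots, V_{t+1}$ unmeasured, i.e., what we called $\tilde{\rho}$. ■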
Proof: (Of Lemma VI.7): We first bound the information Alice has on Bob's input.

For all $t$, $I(M_{A,t} F_A : F_B)$ is bounded above by twice the number of qubits in the messages
so far due to Lemma II.10, assuming that $F_B$ is measured, i.e., $I(M_{A,t} F_A : F_B) \le 2\delta n$.
Thus considering the situation that $F_B$ is distributed uniformly instead of being in the
uniform superposition we get $\mathbb{E}_v\, I(M_A F_A : F_B(v)) \le 2\delta$ (where $v$ is uniformly random),
using Equation (1) and that the $F_B(v)$ are mutually independent. Now,

$$\mathbb{E}_v\, d_{t+1}(v) = \mathbb{E}_v\, h^2\big(\tilde{\rho}_{M_A F_A F_B(v)},\, \tilde{\rho}_{M_A F_A} \otimes \tilde{\rho}_{F_B(v)}\big) = \mathbb{E}_v\, D^2(M_A F_A : F_B(v)),$$

where $F_A M_A M_B F_B$ are as in the protocol without measurements. Also $I(M_A F_A : F_B(v))$
is invariant if $F_B(i)$ is in superposition or measured for $i \ne v$. So,

$$\mathbb{E}_v\, D^2(M_A F_A : F_B(v)) \;\le\; \mathbb{E}_v\, I(M_A F_A : F_B(v)) \;\le\; 2\delta,$$

where the first inequality is by Lemma III.3.